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Abstract 



In this work, the focus is on the improvement of the existing post-Newtonian ap- 
proximation for the gravitational flux from Super Massive Black Hole Binaries. In order 
to improve the existing templates for LISA, we need more accurate post-Newtonian ex- 
pansions for the gravitational flux. Stochastic search techniques like the Markov Chain 
Monte Carlo (MCMC) have been used extensively for searching for sky parameters etc. 
The idea is to combine the two and approach the problem of finding post-Newtonian 
coefficients using MCMC. It has been shown that matching against a 5.5PN signal, 
with noise, the last coefficient can be found by MCMC very easily and displays fast 
convergence. Also the space for higher dimensional searches are explored. 
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Chapter 1 



Introduction 

1.1 Background 

Mass in nonspherical and nonuniform motion is the source of ripples in curved 
spacetime, which propagate away at the speed of light. These propagating ripples in 
spacetime curvature are called gravitational waves (GW). The weak coupling to mat- 
ter is what makes GW so difficult to detect, and so astrophysically interesting. Once 
produced, little is absorbed, unlike electromagnetic radiation. GW, in principle, could 
enable us to see closer to the horizon of a black hole and to earlier moments in the 
universe than with any form of electromagnetic radiation. 

Laser Interferometer Space Antenna (LISA), a joint venture of ESA and NASA, 
would be capable of detecting gravitational waves in the frequency band of 10~ 5 Hz < 
f < 1 Hz. Promising a future direct observation of gravitational radiation, it would 
test one of the most fascinating predictions of General Relativity, and, at the same time, 
becoming a new powerful tool in the astronomical investigation of highly relativistic 
catastrophic events, such as the merging of compact binary systems and the collapse of 
massive stellar cores. The inspiral and merger of massive black hole binaries (MBHBs) 
are a source of the strongest gravitational wave (GW) signals in LIS As frequency band, 
specifically in the sub mHz range. This frequency band, is not observable by the ground 
based detectors, because of a few reasons, such as the interference from the tectonic 
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motion below around 10Hz, and, the time varying gravitational potentials such as the 
weather systems etc. 

The large signal-to-noise ratios (SNRs) of these signals will allow LISA to not 
only detect, but measure physical parameters of the source systems. This would hold 
immense potential to probe questions regarding the role played by black holes in struc- 
ture formation and galactic dynamics. Estimates of the precision with which LISA can 
extract waveform parameters have been developing for the past several years, in par- 
allel with progress in describing MBHB waveforms. The parameters are extracted, in 
essence, by cross - correlating the signal (with noise) with theoretical templates. The 
waveforms from inspiraling binaries are calculated using the post-Newtonian expansion 
of Einstein equations, which assumes that the system comprises of slow moving bod- 
ies. The post Newtonian approximations, however, break down at or near the merger 
and are not expected to give accurate description of the merger waveform. So I rely on 
the part of the signal that extends till some time before that. Also, if the template and 
the signal lose phase with each other by even one cycle, their cross correlation signifi- 
cantly reduces. This means, that, the theoretical templates would have to be better than 
one cycle during an entire sweep through LISA'S band. Althought the post-Newtonian 
calculation technique is being developed to apply to higher order calculation, it would 
be productive if there were a faster and accurate way to obtain the higher order post- 
Newtonian corrections. 

In this thesis, I look at a system of non-spinning extreme mass ratio SMBHB 
inspirals, and the idea is to let stochastic search methods like Markov Chain Monte 
Carlo (MCMC) to search for the post-Newtonian coefficients beyond the 5.5PN. We 



use the form of the function as in [31 1, and also aim to validate the same. 
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1.2 Motivation 

This thesis, is the summary of the work I did and things I read during my visit 
at the Max Planck Institut fur Gravitationsphysik, Potsdam. The guidance of Dr Badri 
Krishnan, and Dr Edward Porter was extremely valuable, in terms of giving direction 
to my learning. 

Stochastic search methods, like the Markov Chain Monte Carlo, have been used 
for estimation of parameters describing various gravitational waves' sources, which 
are going to be observed by LISA fj\. This is done by matching the observed signal 
against the pre-evolved theoretical templates. The stochastic search is used in exploring 
the massive banks of these templates. Extremely accurate post-Newtonian waveforms 
to create these template banks. 

The inspiration of the thesis crystallized around the idea of letting stochastic 
search find the higher order coefficients to the existing post-Newtonian forms. And in 
seeing, for instance, that with the knowledge of theoretically derived post-Newtonian 
coefficients, if the Markov Chains could find the higher order coefficients. 

1.3 Thesis Organization 

The thesis is organized into six chapters. 

The first being the Introductory chapter, the second talks briefly about the General 
Theory of Relativity, the meaning of curvature, Riemann tensor, the Einstein's Equa- 
tions and also covers the Schwarzschild Geometry, Gravitational waves, the Transverse- 
Traceless gauge simplfication. A brief derivation of Newtonian gravitational flux, from 
the quadrupole moment formula is also presented. 

The third chapter, would focus on describing the mathematical modelling of the 
LASER Inerferometer Space Antenna (LISA), and the form of the gravitational wave- 
form as observed by it. 
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The fourth chapter, focusses on the Markov Chain Montel Carlo method, which 
is the primary algorithm I had chosen for all the searches. Here, I would illustrate the 
main search technique. This would comprise of an introduction of the method, followed 
by the most of the 'how' of what I did. 

The fifth chapter would state the results obtained, supported with some explana- 
tion, and the sixth chapter would present the Conclusion, and the scope for future work 
on this. 



Chapter 2 
Gravitational Waves UTTfl (26|] 



A black-hole is a region in spacetime in which the gravitational field is so strong 
that it doesn't allow even light to escape to infinity. It is formed when a body of mass M 
contracts to a size less than, what is commonly refered to as, the gravitational radius 
r g = 2GM/c 2 . The velocity required to leave the boundary of the black-hole and 
escape equals the speed of light. As the speed of light is the limiting propagation 
velocity for physical signals, its obvious that absolutely nothing can escape from the 
region inside the black body. 

In a short time following its formation, a black-hole becomes stationary and its 
field can be uniquely described with the knowledge of its mass, angular momentum 
and its electric charge (if charged). This is because, in the extremely strong field of the 
black-hole, only very special configurations of physical fields (including gravitational 
field) can be stationary. Einstein's equations give the description of the field around the 
black-hole. 

Since no signals can escape a black-hole, but physical objects and light can fall 
into it, the spacetime surface (event horizon) of a black-hole is light-like. From this it 
follows that processes involving a black-hole would be irreversible. If it remains iso- 
lated, and eventually attains a stationary state. To understand the stationary geometry, I 
would like to discuss certain more basic things in brief. 
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2.1 Curvature in spacetime 

A line element specifies the geometry of spacetime. Its expression depends on 
the coordinate system employed, but the essence remains the same. 

For instance, flat spacetime geometry can be described, in Cartesian coordinate 
system, by: 

ds 2 = -c 2 dt 2 + dx 2 + dy 2 + dz 2 
or in polar coordinate system as: 

ds 2 = -c 2 dt 2 + dr 2 + r 2 d6 2 r 2 sin 2 6d(f) 2 

In general, a metric is used to define the geometry. If x a represents the points in space- 
time, the line element joining nearby points, has its length ds given by the following 
expression: 

ds 2 = g a/3 (x)dx a dxP (2.1) 

In this fashion, the flat space time, would have its metric as (dx° = cdt) diag(—l, 1,1,1) 

in Cartesian coordinates, and diag(—l, 1, r 2 , r 2 sin 2 8) in polar. This is also known as 

the Minkowski metric of flat spacetime, and typically represented by r] a p 

Einstein's equivalence principle states that: 

All test particles at the same spacetime point in a given gravitational field will undergo the 

same acceleration, independent of their properties, including their rest mass, and, The out- 
come of any local non-gravitational experiment in a laboratory moving in an inertial frame 
of reference is independent of the velocity of the laboratory, or its location in spacetime. 

Here local has a very special meaning: not only must the experiment not look 
outside the laboratory, but it must also be small compared to variations in the gravita- 
tional field, tidal forces, so that the entire laboratory is moving inertially. 

This implies that the local properties of curved spacetime would be essentially 
Minkowskian. This is an extremely important concept, as it implies that at any point 
in spacetime in curved spacetime (described by the metric g a p(x)), we can always 
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introduce a new system of coordinates x' a such that: 

9 a p(x) = Vap (2.2) 

Also, in curved spacetime, vectors is a concept that is defined locally. It is a 
concept that was used in the flat spacetime, and since the equivalence principle suggests 
that the local properties of curved spacetime are Minkowskian, it becomes a concept 
defined locally at each point. This means, that at a point, normal vector operations are 
valid, but between vectors defined at different spacetime points, they are not. For that, 
they have to be first parallel-transported to the same point first. This means, that vectors 
at different points cannot be added or compared, just like that. Vectors are expressed 
in terms of locally defined coordinate bases. The two most commonly used bases are 
the orthonormal bases and the coordinate bases. In the latter, the unit vectors are such 
that: 

e a (x)ep(x) = g a p(x) (2.3) 



2.1.1 Geodesies 

How a test particle or a light ray moves in a curved spacetime, would tell us 
how the curved spacetime is curved. If a test mass is introduced into the physical 
scenario, it would inevitably move according to the curvature produced by other bodies 
with significant masses. This path is called a geodesic. In every local Lorentz frame, 
this curve would appear to be straight and uniformly parameterized. In other words, a 
geodesic is a curve C(A), that parallel-transports its tangent vector u = ^ along itself, 
ie: 

V u u = (2.4) 
And it leads to the geodesic equation : 



d\ 2 + ^ d\ d\ " u { V 
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The are the connection coefficients, also known as the Christoffel symbols, 
and A is called the affine parameter (parameterizing the geodesic). 

A few properties of the Christoffel symbols are important, that they can be taken 
as symmetric in the lower two indices, ie: 

r^ 7 = (2.6) 

and its expression in terms of the general metric (and its derivatives) is: 

5 _ 1 fdg a p dg ai dg M \ 

2 [ dx , + Q X P dx a) V-0 

This expression, along with various symmetries of the spacetime, can be easily 
juggled with, to evaluate the expressions for the Christoffel symbols. Given an initial 
location in spacetime, and an initial four-velocity, the geodesic equation can be eas- 
ily integrated numerically to find the location and four-velocity at later moments of 
proper time. An important expression, employed to reduce the order and the number of 
equations is that of taking the four- vector dot product of velocity with itself. 

dx a dx 13 

u-u = g a /3-^—-r- = -1 
dr dr 

2.1.2 Riemann Curvature 

The motion of two test particles is enough to detect spacetime curvature. By 
studying the relative motion of two nearby test particles, can a person detect the local 
curvature of the spacetime. By nearby, I mean particles traveling on infinitesimally 
seperated geodesies. Let four-vector x denote the infinitesimal displacement between 
two nearby geodesies. The expression for V u V u x would give the acceleration of the 
seperation vector. Which would give the local curvature of spacetime p6| states this 
relation as: 

(VuVuxT = -R^x'u 5 (2.8) 
which leads to the expression for the Riemann curvature tensor: 

(2.9) 



f)V a f)V a 

pa _ I3S _ wl 0j | pa -pe _ -pa pe 

Pi 6 ~ dx y dx 5 7 135 St M 
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Often, its fully covariant form R a i3ys(= g at jR a p~s) * s a ^ so re f ere d to as the 
Riemann curvature tensor. 

The curvature of spacetime at each point is completely described by this mul- 
tilinear operator, which has 20 algebraically independent components at each point. 
The components of the Riemann tensor identically satisfy a differential equation (the 
Bianchi identities [37|) and certain symmetries 2. 10[ which is why the metric tensoi }2.1 



(which has ten algebraically independent components at each pointt) is enough to com- 
pletely determine the Riemann curvature tensor. 

(2.10) 



R, 



afiyS i ''-add"; 



Ra/SyS 


RfiayS 


RafiyS 


Ra/38y 


Raft^S 


= +R 1 5al3 


+ Ra-y8fS 


= 



For instance, in the local inertial frame, where g a p = T] a p, the Riemann curvature 
simplifies to: 

= 1 / d 2 g aS d 2 g ai d 2 g 5p d 2 g M \ 
aM6 2\dxPdx-y dxPdx 8 dx a dx-r dx a dx s J ' ) 

Another useful tensor, the Ricci curvature tensor, is defined as: 



(2.12) 
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2.1.3 Einstein's Equations 

Finally, I am in a position to introduce the Einstein tensor and the Einstein equa- 
tion. As argued in MTW, the frame independent stress-energy tensor T must act as 
the source of gravity. To link it with the geometry of spacetime, T would need to be 
expressed in terms of a tensor that describes in some way, the geometry of spacetime 
which is a result of gravity. 

This is the Einstein tensor G that we are talking about, and it must be that: 

1. G vanishes when spacetime is flat. 

2. G is constructed from the Riemann curvature tensor 12.91 and the metric 12.11 
only. 

3. G has to be linear in R, to be substantiated as a measure of curvature. 

4. G must be symmetric, second rank, have a zero divergence. 
Finally, G is defined as (with due justification): 

Gfiu = Rfiv — -^QnuR (2.13) 

where R = g^R^ is the curvature scalar. And, this leads to the Einstein field equa- 
tions: 

G = 8?rT (2.14) 

The Riemann tensor R a ^s can be decomposed into two pieces, the Ricci tensor 
and the Weyl tensor C a j3 y s, in a manner analogous to decomposing a matrix into trace 
and tracefree parts. The Riemann, Ricci, and Weyl tensors all have geometric meaning 
independent of any physical interpretation. 

Physical meaning enters via the stress-energy tensor T which can be thought 
of as a 4x4 symmetric matrix (so it has 10 algebraically independent components at 



11 



each point). This tensor completely describes the amount of (non-gravitational) mass- 
energy at each point, and also any momentum (mass-energy flow) and stresses (such as 
the pressures in a fluid). 

The Ricci curvature is directly coupled to the immediate presence of matter at a 
given point. If there is no mass-energy at a given point, the Ricci tensor vanishes. 



2.2 Spacetime around a black-hole: the Schwarzschild 
Geometry 

The Schwarzschild geometry describes the spacetime geometry of empty space 
surrounding any spherical mass. Karl Schwarzschild derived this geometry in 1915, 
within a few weeks of Albert Einstein publishing his fundamental paper on the Theory 
of General Relativity. Due to the symmetries, this is a case when analytical solution to 
the Einstein field equation can be obtained, and this geometry is precisely the geometry 
given by the solution of Einstein equations in vacuum. 

As stated earlier, the line element completely represents the geometry. The line 
element summarizing the Schwarzschild geometry is given by: 



ds 2 = -d- ^M) (cdt) 2 



c 2 r 



'J 



(l - ^jf^J {cdt) 2 + r 2 (d6 2 + sin 2 



The coordinates are called Schwarzschild coordinates, and the corresponding g a/3 is 
called the Schwarzschild metric. The important properties of the metric are that it is 
time independent and spherically symmetric. Also, it is determined by a single param- 
eter M, which is the total mass of the gravitational source which produces the field. 

The proper time, is given by dr = yj—go® — ^1 — dt. 

Also, the coordinate r is not physically the distance from any origin. Like, as 
r — > oo, the proper time — > dt, ie the geometry tends to become Minkowskian. But, as r 
becomes progressively smaller and approaches the value 2 » me proper time interval 
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decreases. As the Schwarszchild coordinate r comes closer to o , the time in the 

c 

local Lorentz frames passes very very slowly. This value is called the Schwarzs child 
radius. 

2GM 

r„ = 



c 2 



In fact, Hartle p2| says: [..It (r) is related to the area A of the two-dimensional 
spheres of fixed radius r and t by the standard formula r = (A/ An) 1 / 2 ..] 

2.2.1 Trajectories in Schwarzschild spacetime: the LSO 

For schwarzschild spacetime, g a p is independent of time, and of </>. 
For a test-mass, E = —p /m, and L = p^/m. This means p and p^ would be 
constant for a trajectory. So would E and L be. Now, for a particle: 



p r = mdr jdr 



p e = (fixing the trajectory plane) (2.16) 



p* = g"% = - 2 L 

r 



Using the equation pp = — m 2 with the above values yields: 

dr\ 2 ^ ( ^ 2GM\ ( \ , L 2 



a; + {2A1) 

which translates to: 

, fdr\ 2 ( 2GM\( L 2 \ 

E 2 =[-r) + 1 1 + (2-18) 



\dr J \ r J \ r 2 
giving the effective potentials as: 

2/ , fdrV ( 2GM\ ( L 2 \ 
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The total E must remain constant. So, 2.18 can be differentiated with respect to 
the coordinate r, which will give the relation: 

d 2 r 1 d 



dr 2 



2dr 



V\r) 



(2.20) 




Figure 2.1: Typical effective potential for a massive particle of fixed specific angular 
momentum in the Schwarzschild metric. 



This physically means, that an orbit in this geometry can be stable only at an 
externum of V 2 (r), as in Fig. 2.1 Putting ^V 2 (r) = 0, and using 

L 2 ' 



2.18 



d 
dr 



2GM\ , 
1 | 1 + 

r / 







(2.21) 



Expressing in terms of geometrized units, where c = 1 & G = l(dimensionless) , 
this operation leads to: 



L 2 ( 



2M 



M 1 



12M J 
L 2 



(2.22) 



Note: I will be using geometrized units henceforth.) 

This means, two values of r exist for a given constant L (only when L > 12M 2 ). 
If, however, the L is lower than the minimum value of \fL2M, the particly does insuffi- 
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cient angular momentum to stabilize at any orbit, and continues on into the black-hole 
and plunges. 

As there is a minimum value of L for a particle in stable circular orbit, there has 
to be a minimum r also. This value of r is know as the Last Stable Orbit (LSO). 



The period for a particle in a stable circular orbit around a black-hole would 
have a time-period of 27r v /(r 3 /M). This expression, coincidentally matches with the 
Newtonian expression. 



Post-Newtonian expansions in general relativity are used for finding an approx- 
imate solution of the Einstein equations for the metric tensor that represents a multi- 
component, tensor gravitational field potential instead of a single, scalar gravitational 
potential in the Newtonian gravity. In the limit, when the velocities involved are small 
as compared to c, the post-Newtonian expansions degenerate and the Einstein theory of 
general relativity is reduced to the Newtonian-like theory of gravity with the instanta- 
neous action-at-the-distance gravitational field interaction. 

Consider a system of particles that are bound together by the mutual gravitational 
forces. The post-Newtonian approximation may be described as a method for obtaining 
the motions of the system to higher powers of the parameters GM/f and v 2 , than given 
by Newtonian mechanics. It is sometimes referred to as an expansion in inverse powers 
of the speed of light. But I'll take c as unity. So, it becomes an expansion in v 2 . 

The equation of motion for a particle is: 



T LSO = 6M 



(2.23) 



2.3 the Post Newtonian Approximation 



d 2 x^ 




dx v dx 



(2.24) 



dr 2 



+ 



dr dr 
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From this the acceleration becomes: 



dt 2 



dt 
dr 



— r l 

1 v\ 



dr 
dx v dx x 
dt dt 



dx % 
dr j 
(ix^ <ix A fix* 
(it (it (it 



Writing it explicitly, in a way where we would be able to make certain approxi- 
mations more evidently; 



d 2 x l 
"dt 2 



dx % . dx % dx k 

Loo ~ ZLoi ~dt~ L *~dt~dF 



+ 



rS„ + 2r S — + r» t --j- ( 2.25) 



What we need is a systematic approximation method, that will not rely on any 
assumed symmetry property of the system. One important such method is the post- 
Newtonian approximation W\. It is adapted to a system of slowly moving particles, 
bound together by gravitational forces. I would try to briefly demonstrate how this 
systematic assumption works. 

i -n 

In the case discussed above, the objective is to compute y ^p- to order ~r. So the 
various components of the affine connection would be needed to the following orders. 



to order 



r 0j &r° to order - 



r 



r 



v n ~ 2 

T) k kT° 0j to order 



T° jk to order - 



r 



It is always possible to find a coordinate system in which the metric tensor is 
nearly equal to the Minkowski tensor [|38|, and the corrections are expandable in powers 
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of GM/r w v 2 . This gives: 

g 00 = -1 + 5-00 + ^00 + ••• 
On = Sij + 9ij + gfj + ■■■ 
9m = 9io + 9io + • • • 

where, gfg denotes gij to order n. 

Using these approximations, the components r oo , r*- fc , can be expanded as: 

Kx = rS M + rif + . • • (2.26) 

and, the components T^, Tq , r°- can be expanded as: 

Kx = I? A )M + r®» + ... (2.27) 

Thus working, we'd be able to get the order to which the metric is to be evaluated, 
as the affine connection (Christoffel symbols) can be calculated from the metric alone 



[2.7 1 . So can the Ricci tensor. 



2.4 In the Transverse - Traceless gauge 

Let the trace-reverse form of h be defined as: 

2 1 

Its called the trace-reverse form because its trace is equal to the negative of the trace 

of h. The weak- field approximation (\h a p\ < 1) to gravitational field, leads to the 
weak-field Einstein equations: 

( ~ Jj£ + y2 ) ^ = ~ l ^T aP (2.28) 
For vacuum (T a/3 = 0), this has a known solution of the form: 

h a(3 = A afS e lkaXa (2.29) 
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As the coordinate system that is chosen is arbitrary, its extremely simplifying 
to impose certain restrictions on them. For approximate solutions to the weak-field 



Einstein's equations [2.28|, the gauge can be changed using a vector that satisfies the 



following equation and still not affect anything. 

d 2 



A known solution to the equation is: £ a = B a e %k ^ , where B a is a constant and is 



the same as in [2.29 1. This produces a change in h ap , given by: 

t(new) _ t(old) f p w n ~ n . 



and for the [Eq |2.29 |'s constant changes as: 



A (NEW) = A (OLD) _ Wak ^ _ iBfjK + lrtapB ^ 

Now, if the following conditions are imposed to the selection of the coordinates, 
immense simplification is attained: 

A a a = 

A aP k p = (2.31) 

AapUf* = 

where U is an arbitrary constant timelike unit vector. Of course, these conditions are 
imposed alongwith the basic Lorentz - gauge condition (h^ = 0). These conditions 
are called the Transverse - Traceless (TT) gauge conditions. Traceless because the trace 
of A vanishes. Transverse because A^ is 'across' the direction of propagation. The 
simplification that is attained here, is that in this frame: 

AH haS "MSJ. A ly ± 0. 

All the remaining components vanish. ! Thus there are only two independent compo- 
nents. The easiest way to write them explicitly is to orient the spatial coordinates so 
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that one axis is along the propagation of the waves. The A^J and represent the two 
polarizations of the gravitational wave. The former is usually called the + (plus) polar- 
ization, and the latter is referred to as the x (cross) polarization. The general solution 
of the weak-field equations is a superposition of the two polarizations. 

To transform to the TT gauge, there are very simple relations which essenially 
set all the nontransverse parts of the metric equal to zero, and subtract out the trace 
from the remaining diagonal elements to make it traceless. For example, for a wave 
propagating in the z- direction, 

h TT = _ h TT = l (h _ h y h TT = h 



2.5 the Newtonian Flux g (3) 

The most simplistic representation of a system emitting gravitational waves can 
be through its quadrupole moment. The origin of this name rests in the analogy with 
electromagnetism. The electromagnetic field is a vector field, and so electromagnetic 
waves can be generated by vector sources, such as an electric current. This means 
that a dipole source is sufficient (a dipole can be described by a vector). Gravity, on 
the other hand, is a tensor field, and the source must contain more components than 
a dipole (vector) to simulate it. A tensor can be regarded as a conjunction of two 
vectors JlOj, so the source must be at least as complicated as two vectors. The simplest 
such arrangement is the quadrupole, consisting of two opposed vector dipoles. The 
resulting field pattern will reflect this more complicated arrangement (sin 2 9 rather than 
sin 9 angular field dependence). 

The quadrupole moment tensor of the mass distribution is given by: 

and its trace - free representation is given by lj k = Ijk — \$jkl\ 
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In the TT gauge, the solution to Equation |2.28| in terms of this quadrupole mo- 
ment is given by: 

hll = J[^,oo(*-0-Coo(*-r)] (233) 

For an isolated system which is emitting gravitational waves, with is the fre- 
quency of the oscillations of the time varying part of the T M „ tensor (assumed sinu- 
soidal), its net gravitational flux at a distance r along the z axis is given as: 

" [Up T l > ~ Mn%I T k * + n^n k n k lTQ (2.34) 



167rr 2 



The total gravitational flux emitted by the source is the integral of this over the sphere 
of radius r. 



which gives 

L=^n 6 (lp Tij ) (2.35) 

and for a general time dependence of T (till now, T was assumed to be sinusoidal with 
frequency f2), 

L -\Qv Ti ')^(Jv Tt ') <"« 

With all this pre-information, an analysis of a binary system can be made. Mas- 
sive black-hole binaries are the systems that are ultimately being looked at. And the 
frame is the centre-of-mass frame. Let y\t and y 2 {t) be the two trajectories of the 
masses m x and m 2 , and y = y[ — y 2 , and r = \y\. The velocities Vi(t) = 

The Newtonian equations of motion give: 

dv\ Gm 2 _ dv*2 Gm\ _ 
dt r 3 ^' dt r 3 ^ 

which gives the relative acceleration as ^ = — ^-^y 
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A simple way of evolving the phase of the gravitational waves emitted by a black- 
hole Binary system, is to use the energy balance equation. The loss of the centre-of- 
mass energy is balanced by the total energy emitted as gravitational flux. In the case of 
circular orbits, this approach is sufficient, as its only needed to find the decrease of the 
orbital separation r. 

dE - L 
~dt ~ ~ 

where, E = — ( ^ rr ^ 1 7 m ' 2 (hence the evolution of r from this). 



The outgoing energy is the L that was defined in Eq. 2.36 Here the quadrupole 
moment is (/i = m^j/ (mi + m 2 ) & m = mi + m 2 ); 

Ifj = A*(vV - l^r 2 ) (2.37) 

The third derivative needed to calculate the total flux, is easily given by differen- 
tiating the above equation. 



d 3 llj A Grnji 



dt 3 r 3 



4— ^(j/V+yV) (2.38) 



Replacing this expression into Eq. |2.36| leads to the 'Newtonian' flux: 

-v 2 (2.39) 
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It is better expressed in terms of a orbital frequency parameter, x 



which is of the order 0(l/c 2 ) in the post-Newtonian expansion. Putting in Kepler's 
Law Gm = r 3 f2 2 the expression for x, and rj = mim 2 /m 2 , gives a succinct definition 
of the Newtonian flux: 

L=~V 2 x 5 (2.40) 
This is only the Newtonian expression for the flux. Successive Post-Newtonian 
approximations provide for more accurate expressions. They are described in the next 
chapter. The core idea of this whole work, is to try to evolve the Post-Newtonian expres- 
sion using stochastic search methods. The idea germinates from the juxtapositioning 
of the facts that the field of detection of gravitational waves is replete with usage of 
stochastic searches, and this is an extension of it in a different direction. 



Chapter 3 



LISA 

3.1 Introduction 

LISA is made of three drag-free spacecrafts, arranged as the vertices of an equi- 
lateral triangle, with the length of each side 5 x 10 6 km. The center of mass of the 
triangular arrangement follows the Earth in an orbit around the Sun, 20 degree behind 
the Earth. The plane of this triangular arrangement makes an angle of 60 degrees with 
the ecliptic. As each of the individual spacecrafts are on different planes, the whole 
triangular arrangement rotates about itself once every orbital revolution, i.e. with the 
period of a year. 

The configuration of LISA is such that the triangular arrangement of detectors 
can be analyzed as a pair of orthogonal two-arm detectors. The motion of the detec- 
tors around the Sun introduces a periodic Doppler shift whose magnitude n phase are 
dependent on the sky position of the source. 

If we assume that there are no fluctuations in the arm length, ignore the signal 
cancellation due to the LISA transfer functions, assume that we can have the data from 
all three spacecrafts simultaneously, ignore the problem of pointing ahead, and the 
frequency of interest is less than the transfer frequency of the detector, i.e. / << 
/* ~ 1CT 2 Hz, we are justified in making use of the results of the Low Frequency 
Approximation (LFA) [|5J. Of course, as the wavelength becomes comparable to the 
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arm length, the response of the detectors would involve fluctuations of arm length, 
pointing ahead and the aforementioned signal cancellation, but for our purposes the 
LFA would suffice fl4j. 

3.2 Mathematical Model 

In the Low-Frequency Approximation, the strain at the detectors (a combination 
of polarizations weighted by the beam pattern functions, taking into account the mea- 
surements at both detectors) due to an incoming GW with polarization h +)X (£) is: 

hit) = h + { X (t))F + + h x ( X (t))F x , (3.1) 
where: x(t) = t — R® sin dcos (a (t) — 4>). 

Here, _R e = 1AU ~ 500secs is the radial distance to the detector guiding center, 
(6, 4>) are the angular coordinates of the source in the sky, a(t) = 2-Kf m t + k, f m = 
1 1 'year is the LISA modulation frequency and k gives the initial ecliptic longitude of 
the guiding center. 

In the LFA, the beam pattern functions are essentially a quadrupole antennae. 
They are defined as: 

F+(t) = l[cos(2ij)D + (t;6 } <j),\) - sin(2^) J D x (t; 0, 0, A)], (3.2) 
F x (t) = l[sm(2^)D + (t;6 l( f),\) + cos(2^)D x (t;6,<j),\)}. 

Here, ift is the polarization angle of the wave. Formally, if L is the direction of 
the binary's orbital angular momentum, and h is the direction from the observer to the 
source ( 180° to the direction of GW's propagation), then ip fixes the orientation of the 
component of L perpendicular to h. The time dependent quantities D +,x are given in 
the LFA by [|4j as: 
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D+(t) 



64 



36 sin 2 (#)) sin(2a(t) - 2A) + (3 + cos(2#)) 



cos(20){9sin(2A)-sin(4a(t) - 2A)} + + (sin(20){cos(4a(t) - 2A) - 9cos2A))} 
- 4\/6sin(20) ^ sin(3a(t) - 2A - 0) - 3 sin(a(t) - 2A + 0) J 



(3.3) 



D x (t) 



v 7 ^ cos(#) ^9 cos(2A - 20) - cos(4a(t) - 2A - 20) 
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6 sin(0) cos(3a(t) - 2A - 0) + 3 cos(a(t) - 2A + 



Here, A = 0, n give the orientation of the two detectors. The GW polarizations up 
to 2-PN order in amplitude corrections is defined by [4|. 

3.3 Likelihood Estimator 



Suppose we could create a filter which would output the autocorrelation function 
of the input. For such a filter, the tranfer function |35] would have to be the com- 
plex conjugate of the input (in the frequency domain). However, as it turns out to be 
noncausal, its impossible to make a real filter like this. However, a filter with some fre- 
quency delay can be constructed, and would be causal too. This is called the matched 



filter pT| (T8J. Its so named, because the filter is said to be matched with the input as it 
outputs input's autocorrelation function (with some delay, of course). While matching 
a dummy signal evolved from theoretical templates agains the actual received signal, 
this is the kind of filtering that is the first idea that comes to one's mind. Towards the 
construction of such a mathematical tool that can be used for this purpose, first the 
Natural Scalar product is defined as follows. This, mentionably, involves division of 
the product of the signal and the template by the spectral density of the noise (all in 
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fourier domain). This indicates, that for the frequencies the noise is high, the 'match- 
ing' or vice-versa of the signal and the template should be given less weightage, and 
vice versa. 



\s) = 2 



OO 



df 



lo S n (f) 

1 /2 

with vector norm \h\ = (h\h) 1 . 



h(f)s*(f) + h*(f)s(f) 



(3.4) 



/oo 
h(t)e 2nLft dt (3.5) 
-oo 

is the Fourier Transform of the time domain waveform h(t). S n (f) is the one- 
sided noise spectral density of the detector, and will be defined in the next subsection. 

Let the signal, in each detector, be Si(t) = hi(t)+rii(t),iis the index of the detector. 
Assume that the noise rii(t) is stationary, Gaussian, uncorrelated in each detector and 
its spectral density is given by S n (f). Then, we can define the SNR to be: 

(his-) 

(SNR)i = is the index of the detector (3.6) 

Given a signal s(t), the likelihood that the true PN coefficients are given by is 
given by 



L(p») = c e -(-M£i)[«-*(£i)>/2 (3 ?) 

where C is a normalization constant. From which also follows the reduced log- 
Likelihood, given as: 

lnL(^) = (s\h$l)) - l -{h(f n )\h{f n )) (3.8) 
Log-Likelihood is going to play the central role in the jump - evaluation of the 



MCMC. There, the aim would be to maximize in the loglikeliood space 3.7 In the 
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high SNR limit, the error in the determination would be la 2 . The SNR and the log- 
Likelihood are related by: 

InL « SNR 2 /2 (3.9) 



3.4 Detector Noise 

The Noise Spectral Density, has two chief components. The instrumental noise, 
and the confusion noise. To model the instrumental noise, the expression for the stan- 



dard one-sided noise spectral density for the LISA is used [ [15 1 
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(3.10) 



where, L = 5 x 10 6 fcm is the arm length of LISA, SP° s (f) = 4 x 10- 22 m?/Hz and 
S® cc (f) = 9 x 10- 30 m 2 /s 4 /i?2; are the position and acceleration noise respectively. 

The quantity /* = 1/ (2itL) is the mean transfer frequency for the LISA arm. 
To the above formula, a random gaussian is given as input, to generate the simulated 
instrumental - noise spectral density. 

The plenitude of unresolvable galactic binaries which constitute the galactic fore- 
ground produce a noise source which is called the confusion noise. This noise is can 
not be assumed to be stationary and Gaussian. The binaries which are bright enough to 
be individually resolved, are resolved and removed beforehand. To model the remain- 
ing galactic noise, Nelemans, Yungelson and Zwart (NYZ) came up with the following 



expression [19] [33 1 
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Figure 3.1: LISA'S noise curve. 



And so the total spectral density of noise is given by: S n (f) = S'^ is (f) + S^ nf (f). A 



plot of the noise is given in the Fig. 3.1 



3.5 The Gravitational Waveform 

The two harmonics of the wave can be written as: 

k 

h +tX (t) = ife^ffW (t)e«**-*, (3.13) 

where, 'k' denotes the order of the harmonic, (f) or b is the orbital phase and H+\ 
are the amplitude components associated with each harmonic. The strongest harmonic 
of the wave is the one corresponding to k = 2, given by the quadrupole moment of the 
source. So, in the restricted post-Newtonian approximation, we ignore all terms other 
than the quadrupole term for the amplitude, and expand the phase corrections of the 
wave up to the order of (v/c) 2n , ie n-PN order. Also, the approximation neglects the 
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phase modulation brought in by the other order amplitude corrections. In this approxi- 
mation, the GW polarizations are given by |TJ: 

= ^x[H^ x + xV*H?n + sff« + x^H^ + s"ff« ]. (3.14) 

C 1 s I 

Here, m = mi + m 2 , the total mass of the binary, r\ = mimj/m 2 is the re- 
duced mass ratio, and D L is the luminosity distance of the source. The post-Newtonian 
parameter x = (Gmco / c 3 ) 2 ^ 3 , where to = d& or b/dt. 



In Eqn 3.14 the include the post Newtonian corrections to the amplitude 
and the extra phase harmonics. But for our purpose, it suffices to work in the restricted 
PN approximation, which means including only if? terms. So, the GW polarizations 



are given by [24| 



= ^(l + coj 2 ^))^), (3.15) 
c 2 D L 

AGmr] 

h x = ip=—cos{L)xsm{<p). 

c 2 D L 

The inclination of the orbit of the binary is defined as cos(l) = L ■ n. 
In the adiabatic approximation, the evolution of the phase of the wave, and the 
instantaneous velocity is governed by 



dt mE'(v) 
d(j) 2v 3 
dt m 

Where, v = (irmf) 1 ^ 3 is the instantaneous velocity, E'(v) = dE/dv is the 
derivative of the orbital enerdy with respect to the velocity. F(v) is the gravitational 
wave flux function. For a test-mass particle in circular orbit around a Schwarzschild 



black hole, and exact expression for the orbital energy is given in [34], the derivative of 
which is given as: 
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E'(v) 



-T]V- 



1 - Qv 2 



(3.17) 



(1 - 3w 2 ) 3 / 2 

We can see that this equation gives an Energy extremum at v — 1/ v^6, which is 
also the velocity at the last stable orbit v tso . For the gravitational Flux function, whose 



Newtonian part was derived earlier in the Section [23} there is a 5.5PN expression [ |20 | 
(6J U29J U25) U30J [|32jl: 



F T (v) = F N (v 
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(3.18) 



F N (v) = 3 4v 2 v 10 . 
5 



(3.19) 
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Chapter 4 



Markov Chain Monte Carlo 



4.1 Introduction 

Stochastic search algorithms inspired by physical and biological systems are ap- 
plied to the problem of optimization in multiple dimensions, with multiple local optima. 
For this type of systems, greedy deterministic search algorithms tend to halt at local 
optimum, usually requiring random restarts to obtain solutions of acceptable quality. 
Stochastic search for constrained optimization can be viewed as a unifying paradigm 
for expressing the fundamental laws governing the behavior of natural systems. Physi- 
cal systems can be modeled as detecting local potential fields and seeking states of low 
free energy. Quantum systems can be modeled as choosing stochastically among local 



extrema of the action integral [27| 



The current problem at hand is essentially of maximizing a function (the log- 
Likelihood), in the Post Newtonian coefficient space. The non- stochastic search for the 
coefficients with the aim of maximizing over the log-Likelihood function, can prove to 
be extremely computationally intensive for this case. In the non- stochastic approach, 
the whole space is first divided into a grid, and then within each grid independent 
searches proceed. For example, if searching in two dimensions, the surface would be 
divided into a hexagonal grid (closest packing): 

In three dimensions, a FCC lattice is used, as it has the highest packing fraction. 
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Figure 4.1: hexagonal grid for 2-D non-stochastic search 



But, as we proceed to higher dimensions, the most efficient packing formations itself 
are not known - which is the basic information on the basis of which the space is covered 
by a grid. In higher dimensions, due to the size of the search space, the non- stochastic 
algorithm would definitely not be computationally very feasible. Hence, the stochastic 
search methods were explored for this particular problem. 

Natural systems provide a rich source of analogies for constructing efficient ap- 
proaches to complex search, optimization, and learning problems. Any problem of free 
energy minimization can be recast as an optimization or statistical inference problem. 
The energy and energy states of the system are interpreted as the objective function to 
be minimized or the probability distribution to be simulated. 

A fairly popular approach applied to derive an approximate solution to the target 
problem is the Markov Chain Monte Carlo, or MCMC. For instance, applying MCMC 
to a mechanical system, a stochastic simulation of the system is constructed in which 
the long run frequency with which each solution is visited is given by the distribution 
that minimizes free energy, ie the Boltzmann distribution. Because low energy states 
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have been defined as good according to the target objective function or probable ac- 
cording to the target distribution, the simulated system evolves over time to spend more 
of its time at good solutions of the target problem. 

There exists a considerable literature documenting successful application of MCMC 



methods to a wide variety of problems [e.g., [ 1 1 1]. A variety of MCMC samplers have 
been constructed for any given problem by varying the sampling distribution subject 
to the local reversibility conditions that ensure convergence to the optimal distribution 
distribution. 

Although the long-run frequency distribution is identical for any MCMC sampler 
satisfying the ergodicity conditions, different samplers on the same surface can vary 
widely in their dynamic behavior. Especially the speed with which a sampler reaches 
the optima and the ease with which it escapes local hills or valleys. Randomized restarts 
help to avoid the tendency to become stuck on a local peak, but lack of mobility remains 
a problem for highly complex and multimodal surfaces. 

Various approaches have been proposed to improve performance of MCMC sam- 
plers. If global information is available about the surface, it can be used to inform sam- 
pling and thus increase efficiency. For example, samplers can be made by hybridising 
Differential Evolution and Markov Chain Monte Carlo [ 28 ] [2], using a population of 
MCMC samplers to assess the variability in results from different runs of the sampler. 
It has been suggested that multiple parallel runs might provide an opportunity to im- 
prove performance by exchanging information among solutions Q. However, due to 
computational intensity, these approaches were found impractical for the task at hand. 



4.2 A Little background 

Let us imagine that we need to calculate the expected value of a function <f)(x), 
where x is a random vector of dimensionality k, with a probability density f(x). Then, 
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this expected value is given as (taking the reasonable assumption < oo ): 

E[<f>(x)] = [ <j>(x)f(x)dx (4.1) 

In the case where analytical integration is not feasible, and numerical integration 
is tedious, the Monte Carlo method is employed. It involves generating a sequence of 
random vectors x n with the same distribution f(x), and the strong law of large numbers 
gives: 

1 n 

E[<t>{x)\ = lim - Y<j>(xi) (4.2) 

i=l 

as the estimate for the expected value. 

But what if sampling from f(x) is practically difficult or infeasible. At this 
point the Markov Chain Monte Carlo comes into the picture. For simplicity, consider a 
system with a finite number of possible states, x\, x 2 , x 3: . . . x n . To each state, assign a 
probability, Pi = f(xi). 

Suppose we have to calculate the expected value of (j>(x), as above. Therefore, 

n 

E[<P]=Y J <P(xi)f(xi) (4-3) 

The probability of transition to a particular state, depends only on the immedi- 
ately preceding state. This identifies the chain as Markovian. Formally, 

P(x(t) = Xj\x(t — 1) = Xji, x(t — 2) = Xj2, ■ ■ ■ ) = P(x{t) = Xj\x(t — 1) = Xi) = Pij 

(4.4) 

The matrix P with elements is called the Markov matrix. 

A state j is said to be accessible from a different state i (i — >■ j) if, given that we 
are in state i, there is a non-zero probability that at some time in the future, the system 
will be in state j. Formally, state j is accessible from state i if there exists an integer 
n > such that 

P(x n = j\x = i) >0 (4.5) 

A state i is said to communicate with state j if it is true that both i is accessible from 
j and that j is accessible from i. A set of states C is a communicating class if every 
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pair of states in C communicates with each other, and no state in C communicates 
with any state not in C. A Markov chain is said to be irreducible if its state space is a 
communicating class; this means that, in an irreducible Markov chain, it is possible to 
get to any state from any state. 

Finally, a markov chain is ergodic in a State Space if all of the space is irreducible 
with respect to the chain. And there exist numbers <pi > 0, J2i n i = 1> an< ^ 71 'j = 



4.3 Simulated Annealing [16] [39] 



Simulated annealing (SA) is a generic probabilistic meta-algorithm for the global 
optimization problem, namely locating a good approximation to the global maximum 
of a given function in a large search space. It is often used when the search space is 
discrete provided that the goal is to find an acceptably good solution in a fixed amount 
of time, rather than the best possible solution. 

The name and inspiration come from annealing in metallurgy, a technique that 
involves controlled cooling of a pre-heated material to increase the size of its crystals 
and reduce their defects. The heat causes the atoms to become unstuck from their initial 
positions (a local minimum of the internal energy) and wander randomly through states 
of higher energy; the slow cooling gives them more chances of finding configurations 
with lower internal energy than the initial one. The distribution of states, as a function 
of Energy, is given by the Boltzmann distribution: 

f E = 2(^—)e^T (4.6) 



Obviously, as the T increases, the probability of more particles being at a higher value 
is E increases. That means, that the particles explore the whole energy band completely 
(in other words, are distributed in a wide region of the energy band). Now, for the 



stochastic search, the MetroPolis Hastings ratio [Eq. 4.8 1 is raised to the power 1/T. 
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This lowers the peaks on the H surface too. 
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Figure 4.2: the Likelihood peaks, as the temperature increases 

By analogy with the physical process, each step of the Simulated Annealing al- 
gorithm replaces the current solution by a random "nearby" solution, chosen with a 
probability that depends on the difference between the Likelihood estimater values and 
on a global parameter T (called the temperature), that is gradually decreased during 
the process. The net affect is that the peaks are lowered and broadened, as shown in 
the figure above. The dependency is such that the current solution changes almost ran- 
domly when T is large, but increasingly "uphill" as T goes to zero. The allowance for 
"downhill" moves saves the method from becoming stuck at local maxima which is 
the prime negative point against greedy algorithms. A cooling schedule defines how 
the temperature falls. It has to be adjusted to allow enough time for the initial burn-in 
phase, as the option of De-initializing of the chain p2| is not availed. 

The importance of the cooling schedule can not be over-emphasised. Suppose 
that the temperature is decreased too rapidly. Then, if the chain gets once stuck at a 
secondary maximum, it would not even get time to come out of it, and the temperature 
would have fallen already. That would mean that the probability of the chain getting out 
of the secondary maximum would fall sharply with time. And, if the temperature falls 
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too slowly, then the chain has just too much time at each peak, including the global 
maxima, and it might just wander off. Thus, the way the fall in the temperature is 
regulated changes a lot in the destiny of the chain. 

The method that is used here, is an adaptation of the Metropolis-Hastings al- 



gorithm [13 1, a Monte Carlo method to generate sample states of a thermodynamic 



system, invented Metropolis et al. 1953. 



4.3.1 Thermostated Annealing [14| 



A slight modification of Simulated Annealing works better, in case where high 
Temperature values are needed to keep the chain from getting stuck, at all times. This 
goes by the indicative name of thermostated annealing. 

When the temperature falls below a certain predecided threshold, the Tempera- 
ture of the system is increased and levelled off. The threshold is fixed in terms of the 



highest SNR attained till that point. As the logLikelihood 3.8 is related to the Signal to 
Noise ratio, by InL ps ^SNR 2 , this relation readily translates into one in terms of the 
log-Likelihood. This injection of heat ensures that the highest possible temperature is 
used, which would help the chain to move out of local maxima. As raising to higher 
temperature would cause lowering of the peaks, the movement about the surface would 
be consequently easier. 

4.4 The Implementation 

Let A m denote the set of the coefficients of v\ and B n denote the set of the 



coefficients of \a.{v)v l in the 5.5PN expansion of the gravitational Flux function 3. 18 



that are included in the search space (the indexing is done backwards). This means, 



that if m — 2, n — 1, then last m(=2) a' sj3.18| and last n(=l) b' sj3.18| are included in the 
search space. 



36 



The values of m and n are changed, and the results were observed. The under- 
lying implementation of MCMC remained more or less the same. So, the search is 
always in a m + n dimensional space. 

4.4.1 The Algorithm 

The algorithm proceeded as follows: 
Let us define a vector 

x = a\2~i, h\2-j] l<i<m, I < j <n (4.7) 

Step 1: An arbitrary starting point is chosen for x. 
..iteration begins.. 

Step 2: A draw is made for the displacement in a single dimension of the search 
space, from a zero - mean Gaussian proposal distribution. (As all the coefficients in 
the PN expansion are non-correlated, we can take jumps in one dimension at a time, 
cyclically for all 'elements of x). The new point, generated by adding the jump to Xi, 
where i changes cyclically over [1, m + n) with each iteration, i.e. y = {xi, . . . , Xj + 
draw, . . .} 



Step 3: The Metropolis - Hastings ratio p3| : 

H -- 



^{y)p{s\y)q{x\y)] 1/T 



(x)p(s\x)q(y\x) 

is evaluated. Here, ir(x) are the priors of the coefficients, p(s\x) is the likelihood of x 
denoting the true value of coefficients, given by comparing the signal evolved using x, 
to the signal given by the known 5.5PN approximation ^ . 7 1 And the last term, q(y\x) is 
the proposal distribution. 

As a symmetric sampling distribution is used, q(x\y) = q(y\x). 

Step 4: The jump is taken, with a probability of mm(l, H). 

Step 5: The temperature is configured for the next iteration. The cooling schedule 
that is chosen works as follows: 
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Lets say that the chain runs iV times, and initial temperature be T . Then nheat (< 
N) is chosen, which would determine the number of steps for which the surface would 
be heated. Let k denote the index of iteration, then the temperature is decided by: 

if(k < nheat) 

{ 



heat = Tq nheat ' 
if (heat < T thresh )heat = T thresh 
Ti a st = heat 
} 

else if(nheat < k < N) 
{ 

heat = T^^^ 
} 

else 

heat = 1.0 

and, 

T thresh = SNR 2 /100 



Then, back to Step 2. 



4.4.2 The Approach 

The first step was to start with a search for m = 1, n = 0, keeping all the other 
coefficients fixed at their original values. The true signal was taken to be the 5.5PN one. 
Then the dimensionality of the search was increased in steps of 1, alternately increasing 
m and n by 1 . 

This was done till the Chains were able to successfully converge at the true val- 
ues. Then this is repeated, including noise in the original signal s. It is important to 
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understand the role played here by the cooling schedule here. The surface plot / con- 
tour plots for the surface were generated, to help with the cooling schedule. Initially 
a simple schedule was introduced, where the temperature fell exponentialy to 1, over 
the whole length of the chain. Then another schedule was implemented, in which the 
temperature fell exponentially to 1 by 15000 steps, and the last 5000 steps were at a 
constant temperature of 1, and this was done for the chain to search locally (around 
the point it would be at at the end of the 15000 steps), and yield more accurate results. 
Also, in this step the number of steps in the chain was increased. Then a schedule was 
implemented such that the temperature exponentially fell, but never below a particu- 
lar threshold. This was done after inspecting the contour plots, which showed a row 
of maxima and it was expected that the chain would get stuck in them. So, a higher 
temperature would be required always for it to be freed. 

The important fact about the sampling scheme is, that it produces a Markov chain 
with a stationary distribution equal to the posterior distribution of interest, p(x\s), re- 



gardless of the choice of proposal distribution [ 1 1 1, although a poor choice of the pro- 
posal distribution would result in the algorithm taking a very long time to converge to 



the stationary distribution (known as the burn — in time). The jumps are scaled |23| 



by the square-root of the heat. Since the chain is not de-initialized [22 1, elements of 
the Markov chain produced during the burn — in phase have to be discarded as they 
do not represent the stationary distribution. Also, as the dimensionality of the search 
space increases, this burn — in time can be very long. 

If the search parameters are correlated, the chain is not very efficient in exploring 
the whole search space, but in this case as the coefficients are all independent, the chain 
was expected to have explored the search space thoroughly. 



Chapter 5 



Results 



The sources which are expected to be observed can be typically divided in to 
two distinct groups. First, where we see coalescence and second, where we dont. 
For the purpose here, either of them would do, as the more engaging matter here is 
about the accuracy of the PN-template available. So, a system where coalescence is 
not observed is chosen. A list of the parameters of the system that was chosen follows: 



Parameters 


Values 


TOi/M 


5 x 10 6 


m 2 /M. 


10 


9/rad 


0.6842 


(fi/rad 


2.5791 


tjyr 


0.458333 


ijrad 


0.9273 


ip/rad 


1.4392 


z 


10 


D L /Gpc 


io- 1 


ip c /rad 


0.3291 


SNR 


~ 40 
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The first step was to start with a search for a u , keeping all the other coefficients 
fixed at their original values. For this search, the one-dimensional surface was plotted, 



and Figure 5.1 was obtained. Also, at the same time, the one-dimensional surface was 



plotted for 6 n [Figure 5.2 1. The purpose of plotting these was to make sure that the 



starting point of the chain is not chosen too close. Also, to predict the heat required in 
the beginning could be estimated better with the help of these. It is noteworthy, that the 
true Signal was taken without the noise. So, this was really a test run per se. 
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Figure 5.1: Loglikelihood v/s coefficient a% 



i- 



As these curves are pretty smooth, and not too steep, it was predictable that the 
Markov Chain should be able to converge on the true value. The SNR ranged around 
600, which is also a pretty high value. Although, of course, the original signal being 
deviod of noise, the term Signal-to-Noise Ratio does not hold much relevance. The 
results from the Markov Chains was extremely rapid convergence. This is given in the 
Fig ]5.3| and Fig ]5.4[ The constant blue horizontal line denotes the true value (known) 
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Figure 5.2: Loglikelihood v/s coefficient bu. 



of the coefficient. 

As the chains converged at the true value of an and bu, the most obvious next 
thing to do would be to check if the same result is obtained in presence of noise in the 
original Signal. 




Figure 5.4: Plot of the chain searching for bu in absence of noise. 
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Figure 5.5: Plot of the chain searching for a u in presence of noise. 



This was implemented [Fig. 5.5 & 5.7 1, and in presence of noise too, the chains 
succeeded in converging at the true value. Even with inclusion of noise, the chains 
were observed to still converge extremely rapidly. Notable is, that, the starting value 
was always chosen randomly (using a random number generator to choose from a range 
of values, of course). 



Also, the posterior was plotted for these two PN coefficients [Fig. 5.6 1. 
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Figure 5.6: Posterior for an 

Given, the success of Markov Chains in searching for an and bn, with all other 
coefficients taken as known; the next idea was to see how the chains fare in searching 
for aio and 610 (individually), with no knowledge of the coefficients of 5.5PN <fe term. In 
other words, searching for aio & b w , with a n = 0.0 and bu = 0.0. In this, the attempt 
essentially was to take a waveform till lower PN approximation, and try to find the 
next PN coefficient, matching against a given higher Post-Newtonian approximation 
waveform. 

However, this search did not turn out positive results and several initial heats were 
tried out. The closest the chains gets, in the multitude of runs performed, is shown in 



Fig. 5.8 & 5.9 These searches were carried out with noisy original signal. 
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Figure 5.7: Plot of the chain searching for 6 U in presence of noise. 
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Figure 5.8: Plot of the chain searching for a 10 in presence of noise, and with a n 
bn = 0. 
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Figure 5.9: Plot of the chain searching for b w in presence of noise, and with an 
bn = 0. 



47 



The next step was to increase the dimensionality of the search space to 2, and 
search for an and 6 n simultaneously (again keeping all the other coefficients of the 
5.5PN expansion at their true values). Before that, the logLikelihood surface was plot- 



ted as a heat map for a n against b n [Fig. 5. 10 1 . 
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Figure 5.10: LogLikelihood surface for the {a n , 6 n } space, plotted as a^v/shu 



With this, the Chains were set running to explore the 2-dimensional surface and 
search for the true values of an & b\\. The signal was initially taken with noise, and 



subsequently without it. The heat map [Fig. 5. 10 1 shows that there is a row of peaks 



running diagonal across the map, and Fig. 5.1 & 5.2 were cross sections of this sur- 
face. This row of peaks means that it was going for the Chains to explore the surface 
completely and find the true maxima. The purpose of this search was to see, that if we 
knew the waveform to 5PN approximation, and the true signal be approximated by the 
5.5PN waveform, then whether Markov Chains find the 5.5PN" 1 coefficients. 



Fig. 5.1 1 and 5.12 show a very typical result for this case. All the chains set out, 
with different initial heats, and different initial values, ended up in random walk like 
this. 



48 



1020 



1010 



1000 



990 



980 



970 



960 



950 




1 1.5 
index of iteration 



x 10 



Figure 5.11: Chain searching for an (and 6 n ), eventually random- walking. 
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Figure 5.12: Chain searching for 6 n (and an), eventually random- walking. 
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Important to note here [Fig. |5.13[, is the value of the local maxima it got stuck 



at. The value is logL = 2.244 x 10 3 , which differs only by 0.31% from the logL value 
for a perfect match (= 2.237016 x 10 3 ). 
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Figure 5.13: LogLikelihood values for the Chain searching for a n & 6 n , using the true 
signal with noise. 



Another attempt at the same was made, with taking the signal without noise. 
The results are given in the figures 5.14 and 5.15 Not surprisingly, the results did not 
improve, and the chain repeated its behaviour by wandering away from the true value. 




Figure 5.15: Chain searching for b n (and a n ), eventually random- walking. The origi- 
nal signal used here is without noise. 



51 



Further, search chains were also setup for searching for coefficients in 3 dimen- 



sions. Which means, m = 2 & n = 1 [4.7 1. The true signal was taken as the 5.5Post 



Newtonian approximation, without noise. These chains showed the following result: 
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Figure 5.16: Chain searching for a u (& bn, a\o), using the true signal without noise. 
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Figure 5.17: Chain searching for b u (& a u , a w ), using the true signal without noise. 
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Figure 5.18: Chain searching for a 10 (& a U} b u ), using the true signal without noise. 



Chapter 6 



Conclusion 



In this thesis, the main idea dealt with is to search for post-Newtonian coeffi- 
cients, for the gravitational flux radiated from Massive Black Hole Binary Inspirals - 
in the test mass limit. The search was done using stochastic search, employing the 



Markov Chain Monte Carlo method [ 13 1 [ 16 1. The idea germinated from the fact that 
to come up with more accurate expressions for the gravitational flux, higher order post 
- Newtonian approximation would be required. And MCMC is a promising way to go 
about it. 

Also, when worked out against actual data from LISA, the coefficients that the 
Markov Chains come up with can be compared with those predicted by the post - New- 
tonian theory. This would actually work out as a test for General Relativity, and a 
verification of its predictions. The approach was tried against existing 5.5 order post - 



Newtonian expression with the mathematical model for the LISA noise [ 15 1 [ 19 1 [33 1, 
and the chains managed to find the last coefficients extremely rapidly. Also mention- 
able is that the values they settled at, were within 1/500^ of a percent of the true values. 



This is illustrated in the folowing histogram plots, for a n and 6 1X [Eq. 3.18 1 . 
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Figure 6.1: Distribution of the values taken by the chain for a n in presence of noise. 
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Figure 6.2: Distribution of the values taken by the chain for bu in presence of noise. 
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However, the searches attempted beyond trying to search for the last coefficient, 
did not yield positive results. Several cooling schedules [Sec 4.4.2J and starting temper- 
atures were experimented with. However, the chains failed to yield persistent results. 
Theoretically pondering, this should have worked, but it did not. On investigating for 
the reason, it was found in the plenitude of local maxima found in the neighborhood 
of the global maxima. This is clearly illustrated in a surface plot of the log-Likelihood 



[Fig 6.3 1 (which was used as the indicator to the match of the signal evolved using the 
predicted coefficients and the ideal signal) against the last two coefficient - dimensions: 



x10' 




Figure 6.3: Log-Likelihood surface, in the space of the last two coefficients (a n and 
&u). 

It is sufficiently clear that there is a row of maxima running across the plane of 
the two dimensions. The chain kept getting stuck on these maxima, which were found 
to be as close as within 0.25 % of the true perfect-match log-Likelihood. However this 
is to be treated as a preliminary study of this approach and in future, the work is hoped 
to be extended to searching for physical parameters along-with the post-Newtonian 
coefficients. It would be even more interesting to see if this approach works when the 
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physical parameters of the system are not fixed themselves (as they were, throughout 
here). The idea is, to have MCMC to search for the parameters like the sky position 
or the masses, after an initial preliminary search and narrowing down of the candidates 
has been accomplished, along-with searching for the Post-Newtonian coefficients. The 
coefficients found can be matched against the theoretically predicted values, and the 
comparison can be well treated as a test for the PN theory, and that of GR itself. 
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